- 
            Previous work has benchmarked multiple speech recognition systems in terms of Word Error Rate (WER) for speech intended for artificial agents. This metric allows us to compare recognizers by the frequency of their errors; however, errors are not equally meaningful in terms of their impact on understanding the utterance and generating a coherent response. We investigate how the actual recognition results of 10 different speech recognizers and models affect response appropriateness for a virtual human (Sergeant Blackwell), who was part of a museum exhibit, fielding questions "in the wild" from museum visitors. Results show a general correlation between WER and response quality, but this pattern does not hold for all recognizers.
- 
            Lücking, Andy; Mazzocconi, Chiara; Verdonik, Darinka (Ed.)
- 
            We present the first spoken dialogue system for the Choctaw language in this paper. Choctaw is an endangered American indigenous language spoken by the Choctaw tribe. Previous work in this area created a text-based English-Choctaw bilingual chatbot, named Masheli, that primarily shared stories about animals. Additional work developed an automatic speech recognizer (ASR) to process spoken Choctaw. In this paper, we demo the Choctaw ASR together with the Masheli chatbot to form a dialogue system that allows the user to speak, rather than type, to the system. As the language is endangered, a spoken dialogue system would assist revitalization efforts by promoting oral fluency in language learners.
- 
            In this paper, we compare two different approaches to language understanding for a human-robot interaction domain in which a human commander gives navigation instructions to a robot. We contrast a relevance-based classifier with a GPT-2 model, using about 2000 input-output examples as training data. With this level of training data, the relevance-based model outperforms the GPT-2-based model 79% to 8%. We also present a taxonomy of the types of errors made by each model, indicating that they have somewhat different strengths and weaknesses, so we also examine the potential for a combined model.
- 
            We evaluate several publicly available off-the-shelf (commercial and research) automatic speech recognition (ASR) systems on dialogue agent-directed English speech from speakers with General American vs. non-American accents. Our results show that the performance of the ASR systems for non-American accents is considerably worse than for General American accents. Depending on the recognizer, the absolute difference in performance between General American accents and all non-American accents combined varies from approximately 2% to 12%, with relative differences varying from approximately 16% to 49%. This drop in performance becomes even larger when we consider specific categories of non-American accents, indicating a need for more diligent collection of and training on non-native English speaker data in order to narrow this performance gap. There are performance differences across ASR systems, and while the same general pattern holds, with more errors for non-American accents, there are some accents for which the best recognizer is different from the best recognizer overall. We expect these results to be useful for dialogue system designers in developing more robust, inclusive dialogue systems, and for ASR providers in taking into account performance requirements for different accents.
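Several of the abstracts above compare recognizers by Word Error Rate and report absolute and relative performance differences. As a minimal sketch of how these quantities are commonly computed (not the authors' evaluation code), the Python below uses the standard word-level edit distance for WER. The function names, the whitespace tokenization, and the choice of the first argument as the baseline for the relative gap are all assumptions made for illustration; the abstracts do not specify them.

```python
from typing import Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_and_relative_gap(baseline: float, other: float) -> Tuple[float, float]:
    """Return (absolute gap, relative gap) of `other` with respect to `baseline`.

    Assumption: the relative gap is measured against the baseline value.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative gap")
    absolute = other - baseline
    return absolute, absolute / baseline


# Hypothetical transcripts, invented for illustration only.
wer = word_error_rate("where is the library", "where is library")
print(f"WER: {wer:.2f}")

# Hypothetical error rates, not figures from the abstracts above.
abs_gap, rel_gap = absolute_and_relative_gap(0.10, 0.15)
print(f"absolute gap: {abs_gap:.2f}, relative gap: {rel_gap:.2f}")
```

Because WER counts every error equally, two hypotheses with the same WER can differ in whether they preserve the words a dialogue system needs to pick a suitable response, which is the limitation the first abstract raises.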
 An official website of the United States government